Abstract

Background: gp340, a member of the scavenger receptor cysteine-rich (SRCR) family encoded by Deleted in Malignant Brain
Tumors 1 (DMBT1), is an important component of innate immune defense. The first scavenger receptor cysteine-rich
domain (SRCR1) of gp340 has been shown to inhibit HIV-1 infection by binding to the N-terminal flank of the
V3 loop of HIV-1 gp120.

Results: Through homology modeling and docking analysis of SRCR1 to a gp120-CD4-X5 antibody complex, we
identified three loop regions containing polar or acidic residues that directly interacted with gp120. To confirm the
docking prediction, a series of overlapping peptides covering the SRCR1 sequence was synthesized and analyzed
by a gp120-peptide binding assay. Five peptides coinciding with the three loop regions showed relatively high binding
indexes. An alanine substitution scan revealed that Asp34, Asp35, Asn96 and Glu101, in the two peptides with the highest
binding indexes, are the critical residues in the SRCR1 interaction with gp120.

Conclusion: We pinpointed the vital gp120-binding regions in SRCR1 and narrowed down the amino acids that
play critical roles in contacting gp120.

Keywords: SRCR1, DMBT1, HIV-1 gp120, Automated docking



Background 

Salivary agglutinin (SAG) and lung scavenger receptor
glycoprotein (gp340) are alternatively spliced products of
Deleted in Malignant Brain Tumors 1 (DMBT1), a gene
originally identified in brain tumors [1]. SAG, secreted
by salivary glands, was originally found to aggregate the oral
bacteria Streptococcus gordonii and Streptococcus mutans, which play
important roles in dental caries [2]. Gp340 can also bind
a number of Gram-negative and Gram-positive bacteria,
including E. coli, Lactobacillus casei, Helicobacter pylori,
S. aureus, S. pneumoniae, and Haemophilus influenzae
[3-5]. Recently, gp340 was shown to inhibit cytoinvasion
of Salmonella enterica in intestinal epithelial cells [6]
and the infectivity of influenza A virus [7]. Previous
studies showed that soluble gp340 specifically inhibited
HIV-1 infection [8,9] by interacting with the viral envelope
protein gp120 [10]. Subsequent studies identified a
short sequence located at the N-terminal flank of the
gp120 V3 loop that interacted with gp340 [11]. An N-SRCR
recombinant protein containing the first SRCR domain
and one-half of the first SID of gp340 was shown to bind
to the N-terminal flank of the V3 loop of gp120 and to inhibit
both CXCR4- and CCR5-tropic HIV-1 isolates, exhibiting
properties similar to those of the parental gp340 [12]. Although
the bioactivities of gp340 as an innate immune component
have been well documented [13], the detailed
molecular mechanisms remain largely unclear. Since
SRCR1 binds to a highly conserved region of HIV-1
gp120 [12], it is a potential candidate for an antiviral
inhibitor.

An SRCR domain consists of 90-110 amino acids. SRCR domains are divided
into Groups A and B, depending on the number of
disulfide bonds and the locations of the cysteines [13].
Group B SRCR, to which gp340 belongs, contains either
6 or 8 cysteines with cysteines at positions 1 and 4
always present, while Group A SRCR contains 6 cysteines
with positions 1 and 4 always absent. Since SRCR forms
a highly structured domain and structural
elements are likely to be part of the gp120-binding domain, it is
imperative to determine the sequence or minimal structural
domain needed for mediating gp120 binding. The
structure of Mac-2 binding protein (M2bp), a Group A
SRCR, has been determined by crystallography [14]. In the
current study, a 3D structure of SRCR1 was constructed
through homology modeling based on the known structure
of M2bp, and was docked with the published 3D
structure of the gp120-CD4-X5 complex, which is a gp120
core in complex with soluble CD4 and the Fab fragment of
the HIV-1 neutralizing antibody X5 [15]. A probability analysis
was conducted on the multiple docking models [16] and
the gp120-contacting amino acid residues on SRCR1 were
predicted. To verify the docking analysis, linear peptides
were synthesized according to the prediction values and
a gp120-peptide binding analysis was conducted. The results
supported the validity of the modeling and identified a
number of regions that were in contact with gp120. This
study provides atomic-level insight into the interactions between
gp120 and SRCR1, sheds light on the inhibitory mechanism
of SRCR, and lays the groundwork for further study of
the SRCR sequences and structures important for HIV-1 inhibition
and for the design of new drugs targeting HIV-1.

Methods 

Modeling of SRCR1 domain 

The structure of SRCR1 (residues 95-203 of DMBT1,
renumbered as 1-109 in this article) was built from the
known structure of the M2bp SRCR domain (PDB ID: 1BY2)
using MODELLER 9v4 [14,17-19]. Structural
refinement was accomplished by energy minimization
and molecular dynamics using Insight II 2001
(Accelrys, San Diego, CA).
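
Because residue numbers in this article (1-109) differ from the DMBT1 numbering (95-203), a small conversion helper can prevent off-by-one mistakes when comparing residues across sources. The following is a minimal sketch in Python; the function names are hypothetical and the only assumption is the mapping stated above (DMBT1 residue 95 corresponds to SRCR1 residue 1).

```python
# Residue renumbering between DMBT1 (95-203) and the SRCR1 numbering
# used in this article (1-109). Function names are illustrative only.
DMBT1_START = 95
DMBT1_END = 203
OFFSET = DMBT1_START - 1            # 94
SRCR1_LENGTH = DMBT1_END - OFFSET   # 109


def to_srcr1(dmbt1_pos: int) -> int:
    """Convert a DMBT1 residue number to the 1-109 SRCR1 numbering."""
    if not DMBT1_START <= dmbt1_pos <= DMBT1_END:
        raise ValueError(f"{dmbt1_pos} is outside the SRCR1 domain")
    return dmbt1_pos - OFFSET


def to_dmbt1(srcr1_pos: int) -> int:
    """Convert a 1-109 SRCR1 residue number back to DMBT1 numbering."""
    if not 1 <= srcr1_pos <= SRCR1_LENGTH:
        raise ValueError(f"{srcr1_pos} is outside 1-{SRCR1_LENGTH}")
    return srcr1_pos + OFFSET
```

For example, `to_dmbt1(34)` gives the DMBT1 position of the residue referred to as Asp34 in this article.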



Automated docking of SRCR1 to monomeric gp120-CD4-X5 complex

Docking of SRCR1 to the gp120-CD4-X5 complex (PDB
ID: 2B4C) was performed using ZDOCK 3.0.1, a rigid-body
protein-protein docking program [20]. ZDOCK uses
a fast Fourier transform to search all possible binding
modes for the proteins, and evaluates them based on
shape complementarity, desolvation energy, and electrostatics.
The top predictions from ZDOCK were then
rescored by RDOCK to improve the energies and eliminate
clashes.

Peptide design, synthesis and purification 

A complete set of 15-mer peptides, overlapping by 10
amino acids, was derived from SRCR1. These peptides
were numbered according to the SRCR1 sequence, P1
standing for residues 1-15, P2 for 6-20 and so on. To
study the role of the vital amino acids in gp120 binding, a
set of peptides carrying alanine substitutions at the
corresponding residues was also prepared. All peptides were synthesized and
purified by high-performance liquid chromatography
(HPLC) to >95% purity (HD Biosciences Co., Shanghai,
China). The biotin tag was attached at the N-terminus of
the peptide.
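
The peptide layout can be sketched as a sliding window over the 109-residue SRCR1 sequence. The Python snippet below is an illustrative reconstruction, not the authors' procedure: it assumes that start positions advance by 5 residues (as implied by the 10-residue overlap and the residue ranges in Table 1) and that the final peptide is truncated at the C-terminus. The function name is hypothetical.

```python
def overlapping_peptides(seq_len=109, size=15, step=5):
    """Return (name, start, end) tuples, 1-based and inclusive.

    Assumption: each window starts `step` residues after the previous
    one, and the last window is clipped at the end of the sequence.
    """
    peptides = []
    start = 1
    while True:
        end = min(start + size - 1, seq_len)
        peptides.append((f"P{len(peptides) + 1}", start, end))
        if end == seq_len:
            break
        start += step
    return peptides


if __name__ == "__main__":
    for name, start, end in overlapping_peptides():
        print(f"{name}: {start}-{end}")
```

Under these assumptions the layout yields 20 peptides, with P5 covering residues 21-35, P19 covering 91-105, and a 14-residue P20 covering 96-109, matching the ranges listed in Table 1.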

Peptide binding assay (solid-phase ELISA) 

Recombinant gp120s were purchased from Immuno
Diagnostics, Inc. (Woburn, MA, USA) or obtained from the NIH
AIDS Reagent Program (Bethesda, MD, USA). 96-well
polyvinylchloride plates (Corning, NY, USA) were coated
with 50 μl of 8 μg/ml gp120 diluted in 50 mM bicarbonate
buffer, pH 9.6, and incubated at 4°C for 16 hours.
Unbound protein was removed by repeated washing with
20 mM Tris-HCl, pH 7.4, containing 0.05% Tween-20
(washing buffer). Nonspecific sites were blocked with
200 μl of 2% nonfat milk dissolved in washing buffer, at 37°C




Figure 1 Two reverse views of the SRCR1 structure. β-strands are in cyan; the two α-helices are in magenta. Disulfide bridges are in yellow and the
sequence numbers of cysteine residues are indicated.
Figure 2 Docking of SRCR1 with monomeric gp120-CD4-X5 complex. A) The structure of the docked SRCR1. SRCR1 and the gp120 core
(in blue) are shown in solid ribbon style. CD4 and X5 are represented in silk ribbon style in orange (CD4), green (X5 light chain) and grey
(X5 heavy chain), respectively; B) Electrostatic surface of SRCR1 (left) and the V3 loop of gp120 (right). The molecular surfaces of both SRCR1 and
the V3 loop of gp120 are colored according to the calculated electrostatic surface potential, with negative charges in blue, neutral in white, and
positive charges in red; C) A closer view of the gp120-binding sites. The residues in SRCR1 that make contact with gp120 are shown as
sticks in grey, whereas the residues in gp120 are colored in blue; D) The docking frequency of residues 1-109 in SRCR1. Three loop regions with
high docking frequency, 33-36, 73-76 and 94-97, are indicated.



for 60 min. Biotinylated peptides (50 μl at 30 μg/ml) diluted
in washing buffer were added and allowed to bind gp120
at 37°C for 60 min. Unbound peptides were removed by
repeated washing. For competition studies, biotinylated
P19 (40 μg/ml) was mixed with increasing concentrations
of non-biotinylated P19, added to a gp120-coated plate
(8 μg/ml) and incubated at 37°C for 60 min. Unbound peptides
were removed by repeated washing. Since P1 showed
no binding to gp120 in the previous binding analyses, it
was used as a control at the maximum concentration
(640 μg/ml). The bound peptides were detected with AP-conjugated
streptavidin (Vector, CA, USA) and PNPP
liquid substrate (Sigma, USA), and absorbance was measured at 405 nm.



Results and discussion 

Structural modeling of SRCR1

DMBT1 SRCR1 and M2bp SRCR shared high homology
(50%) in primary sequence. The SRCR1 3D structure was
organized around a curved four-stranded β-sheet cradling
two α-helices (Figure 1). There were 4 disulfide bridges in
SRCR1: the Cys33-Cys97 disulfide bridge, which
linked the C-terminus of the β3 strand to the α2-helix; the
Cys46-Cys107 disulfide bridge, linking the α1-helix to the
underlying β-sheet; the Cys77-Cys87 disulfide bridge,
circularizing the loop and containing a turn around Glu81;
and the Cys17-Cys51 disulfide bridge, linking the α1-helix
to the β-turn between the β1 and β2 strands.
Figure 3 Gp120 interaction with SRCR1 peptides. A) Schematic illustration of the SRCR1 sequence. Twenty overlapping 15-mer peptides
covering the complete SRCR1 sequence were designed and synthesized. These peptides were numbered according to the SRCR1 sequence, P1
standing for residues 1-15, P2 for 6-20 and so on. The peptides with the highest binding indexes with gp120 are identified and the key binding
amino acids are marked with stars; B) Recombinant gp120BaL interaction with peptides derived from SRCR1. Biotinylated peptides (30 μg/ml)
were allowed to bind immobilized gp120 (8 μg/ml) and were tested for their binding capability to gp120 by ELISA; C) Non-biotinylated P19
competitively inhibited biotinylated P19 binding to gp120 (8 μg/ml). Biotinylated P19 (40 μg/ml) was pre-incubated with increasing
concentrations of non-biotinylated P19, added to a gp120-coated plate (8 μg/ml) and incubated at 37°C for 60 min. Unbound peptides were
removed by repeated washing. Since P1 showed no binding to gp120 in previous binding assays, it was used as a control at the maximum
concentration (640 μg/ml).



Compared with M2bp SRCR, SRCR1 has an additional
disulfide bridge, Cys17-Cys51, while the corresponding
M2bp residues, Asn15 and Phe49, are in close proximity
with a Cα-Cα distance of 7.5 Å [14]. Hohenester et al.
postulated that the 1-4 disulfide bridge (Cys17-Cys51) in
group B domains (SRCR1) would link the C-terminus of
the α-helix (α1) to the A-B loop (β1-β2) and pack favorably
against the 3-8 bridge (Cys46-Cys107) [14]. The
disulfide bond Cys17-Cys51 was therefore inserted in the
structure according to their hypothesis (Figure 1). The
surface loop structure with a β-turn, located between the β1
and β2 strands, was previously shown to be involved in
bacteria binding [4].



Table 1 SRCR1 peptides binding to gp120

Peptide   Amino acid sequence   Residues of SRCR1 sequence   Relative binding
P5        VEILYRGSWGTVCDD       21-35                        +4
P6        RGSWGTVCDDSWDTN       26-40                        +4.5
P14       GSGPIALDDVRCSGH       66-80                        +1.5
P15       ALDDVRCSGHESYLW       71-85                        +2
P19       GWLSHNCGHGEDAGV       91-105                       +7.5
P20       NCGHGEDAGVICSA        96-109                       +1.5



Docking of SRCR1 with monomeric gp120-CD4-X5 complex

Our earlier study demonstrated that gp340 binding to
gp120 was significantly enhanced by pre-treatment
of gp120 with soluble CD4 (sCD4) and that 17b antibody
binding to the gp120-sCD4 complex did not abrogate
gp340 binding, and vice versa [11]. We suggested that
the gp340-binding region on gp120 was occluded or partially
occluded on native gp120 and that sCD4 binding
induced the exposure of the region, thus allowing high-affinity
interaction between gp340 and gp120 [11].
Therefore, we speculated that gp120 in complex with
sCD4 would be a favorable model for our docking
analysis. Since the gp120-sCD4-17b structure does not
contain a V3 sequence [21], a newly published crystal
structure of a V3-containing HIV-1 gp120 core in complex
with sCD4 and the HIV-1 neutralizing antibody X5
[15] was selected. X5 and 17b belong to the same class
of HIV-1 neutralizing antibodies, termed CD4-induced
(CD4i) antibodies, which recognize highly conserved
epitopes exposed upon CD4 binding. The binding of
CD4i antibodies to gp120 is typically enhanced by
CD4 binding. To provide restraints on the docking possibilities,
a number of factors were considered. First, the
location of the V3 loop and the surface for SRCR1
binding on the trimeric viral spike in the native state, as



Figure 4 Determination of critical residues in P6 and P19 peptides. A) Alanine substitution scan of the peptides. Biotinylated peptides
(30 μg/ml) were allowed to bind immobilized gp120BaL (8 μg/ml). P6/34 means that Asp34 in P6 was substituted by alanine. The same notation applies to the other
peptides, such as P6/35, P19/96, P19/101 and P19/96/101; P1 was used as a negative control; B) The structural view of P6 and P19 in the SRCR1
model. P6 is colored in cyan and P19 is colored in purple. The Asp34, Asp35, Asn96 and Glu101 residues are labeled on the regions of P6
and P19.



determined by Liu et al. [22], were used to orient the
SRCR1 structure in the docking. Second, gp340 is a
macromolecule with 14 SRCR domains in tandem order;
thus SRCR1 can only be docked to a region outside of
the plane formed by the β-sheet on the V3 loop because of
steric hindrance. Figure 2A shows a typical
docked complex of SRCR1 and gp120-sCD4-X5. The
SRCR structure interacted with gp120 on the open face,
near the V3 loop site [11], and the N- and C-termini of
SRCR, which link the SRCR1 subunit to the rest of
gp340, pointed away from gp120.

Analysis of gp120-contacting amino acids on SRCR1 

The docked complex of SRCR1 and gp120-CD4-X5 is
shown in Figure 2A. Three regions with apparently high
docking frequencies were identified (Figure 2D): the surface
Loop1 between the β3 strand and α1-helix (Cys33
to Ser36), the surface Loop2 near the Cys77-Cys87 disulfide
bridge (Asp73 to Arg76) and the surface Loop3 near
the α2-helix (Ser94 to Cys97). These three loops formed
a negatively charged cluster on the SRCR1 3D structure,
and a number of their amino acids contacted the V3
loop of gp120, as shown in Figure 2B. Figure 2C shows
a closer view of the gp120-binding sites on SRCR1.
Asp35 and Asp73 frequently formed salt bridges with
Arg304 of gp120. Ser36 formed a van der Waals interaction
or a hydrogen bond with Thr320. Asp74 formed
hydrogen bonds with Tyr318 and Thr320. Arg76, the only
basic residue in the gp120-binding sites, frequently
formed a salt bridge with Glu322 of the V3 loop. Asn96
showed the highest binding frequency among the contact
residues in SRCR1 (Figure 2D), and always made contact
with Lys305, Arg304, Ser306 and Tyr318. Ser94 and
His95 frequently interacted with Tyr318. Docking analysis
indicated that the frequency of gp120 interaction
with residues 93-97 in SRCR1 was higher than that
of any other residues, suggesting that this region might
be the most important for the stability of the interaction.
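
The idea of a per-residue docking frequency can be illustrated with a short sketch. The exact definition used in the probability analysis [16] is not given here, so the Python code below assumes one plausible reading: the fraction of docked poses in which a residue contacts gp120. The function name and the toy pose data are hypothetical and are not results from this study.

```python
from collections import Counter


def contact_frequency(poses, n_residues=109):
    """Fraction of poses in which each SRCR1 residue contacts gp120.

    `poses` is a list of collections of contacting residue numbers
    (1-based), one collection per docked model. This definition is an
    assumption made for illustration.
    """
    counts = Counter()
    for contacts in poses:
        counts.update(set(contacts))  # count each residue once per pose
    n_poses = len(poses)
    return {
        res: (counts[res] / n_poses if n_poses else 0.0)
        for res in range(1, n_residues + 1)
    }


# Toy input, invented purely to exercise the function.
toy_poses = [{35, 73, 96}, {34, 35, 96}, {76, 94, 96}, {35, 95, 96}]
toy_freq = contact_frequency(toy_poses)
```

Plotting such a dictionary against residue number would give a profile of the kind shown in Figure 2D, where peaks mark candidate contact regions.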

The amino acid contact frequency analysis was
consistent with published studies [11]. Earlier studies indicated
that the charges of the V3 residues
dominated the gp120 electrostatic potential [23] and that a
minimal gp340-binding sequence was localized to the
V3 N-terminal flank (aa 303-306), which was rich in
positively charged residues [11]. A V3 peptide containing
residues 303-306 was found to bind both soluble
[11] and cell-associated gp340 [24], and inhibited the gp120-gp340
interaction. In addition, our model suggested
that Phe317, Tyr318, Thr320 and Glu322 at the C-terminus
of the V3 loop were involved in the gp120-SRCR interaction,
though ELISA failed to show such an interaction
[11]. Cys33, Asp34, Asp35, Asp73, Asp74, Arg76, His95,
Asn96, and Cys97 were conserved in all 13 SRCRs of
gp340, indicating that they might be indispensable for
the bioactivity of gp340 and that the other SRCRs might
also bind gp120. In some SRCR domains, Ser36 and
Ser94 were replaced by Thr and Tyr, respectively, which
did not significantly change the polarity of the residues.
Therefore, we speculated that the inter-molecular salt bridges
and polar interactions were conserved and important
for the specificity of the interaction.

Gp120 interaction with SRCR1 peptides

To validate the predictions of the docking analysis and to
characterize the gp120-binding region, a series of linear
peptides spanning the SRCR1 sequence was synthesized and
analyzed for interaction with gp120 in a solid-phase
ELISA. The sequence of SRCR1 is shown in Figure 3A,
and the peptides with high binding indexes are identified
according to their location in the SRCR1 sequence.
Figure 3B shows that a number of overlapping
peptides interacted with gp120, with P5, P6 and P19
showing the highest binding indexes. The sequence shared
by P5 and P6 was RGSWGTVCDD, which included the
high docking frequency residues Cys33, Asp34 and Asp35
within Loop 1. P19, containing the high docking frequency
residues Asn96 and Glu101 in Loop 3, exhibited the
highest binding index. P14 and P15, with the overlapping sequence
ALDDVRCSGH and the high docking frequency residues
Asp73, Asp74 and Arg76, also interacted with gp120.
The results were consistent with the computer modeling
and are summarized in Table 1.
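
The correspondence between the gp120-binding peptides and the predicted loops can be checked by intersecting residue ranges. The Python sketch below uses only the peptide ranges from Table 1 and the three loop regions from the docking analysis; the variable and function names are hypothetical.

```python
# Loop regions with high docking frequency (SRCR1 numbering, inclusive).
LOOPS = {"Loop1": (33, 36), "Loop2": (73, 76), "Loop3": (94, 97)}

# Residue ranges of the gp120-binding peptides listed in Table 1.
TABLE1_PEPTIDES = {
    "P5": (21, 35), "P6": (26, 40), "P14": (66, 80),
    "P15": (71, 85), "P19": (91, 105), "P20": (96, 109),
}


def overlap(a, b):
    """Return the shared inclusive range of two ranges, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def loops_covered(peptides=TABLE1_PEPTIDES, loops=LOOPS):
    """Map each peptide to the loop residues it contains."""
    covered = {}
    for name, span in peptides.items():
        hits = {}
        for loop, loop_span in loops.items():
            shared = overlap(span, loop_span)
            if shared is not None:
                hits[loop] = shared
        covered[name] = hits
    return covered
```

Each Table 1 peptide contains residues from exactly one loop: P6, P14, P15 and P19 span their loop completely, whereas P5 stops at residue 35 and P20 starts at residue 96.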

To demonstrate the specificity of the interaction,
biotinylated P19 binding to gp120 was competed with
non-biotinylated P19 in an ELISA. Figure 3C shows that
increasing concentrations of non-biotinylated P19 reduced
biotinylated P19 binding to gp120 in a dose-dependent
manner while the control P1 had no effect, suggesting that
the P19-gp120 interaction was sequence specific and not
mediated by the biotin.

To further test the docking frequency predictions, we introduced
alanine substitutions at the four residues with the
highest docking frequency in two peptides (Asp34 and
Asp35 in P6, Asn96 and Glu101 in P19), either individually
or in combination. All the mutant peptides showed
reduced binding to gp120 (Figure 4A). However,
P19 binding to gp120 was only minimally affected by mutations
at positions 96 and 101, whether individually or in combination. We
think that this may reflect a discrepancy between the
modeling and the peptide-based binding assay. The
modeling was performed on a folded SRCR1 with three
surface loop regions directly interacting with gp120.
Asn96 and Glu101, both located in the interface of the
SRCR1-gp120 docking complex (see Figures 2B and
4B), were thus predicted to be important for the protein
interaction. However, these two amino acids in P19 may
not be in the best conformation to interact with gp120,
and some other amino acids that are not located in the
interface may be available in P19 to interact with gp120.

Using a series of peptides based on a consensus SRCR
sequence, a bioactive peptide (SRCRP2) was shown to
bind streptococci and mediate bacterial aggregation [4].
Interestingly, the sequence of SRCRP2 coincides with the
sequences of P5 and P6. We therefore hypothesized that the
same region of SRCR1 contributes to the binding of both
bacteria and HIV-1. However, P5 and P6 did not show a
significant inhibitory effect in an in vitro HIV-1 pseudovirus
inhibition assay (data not shown). We also discovered
that SRCR1 protein with disrupted disulfide bonds had
significantly reduced binding to gp120 (data not shown).
In view of the tightly folded SRCR domain, we speculated
that a certain conformational element is required for
SRCR1 to form a high-affinity interaction with gp120
and to inhibit HIV-1 infection, consistent with the current
observation that the gp120-contacting amino acids were
distributed across distal parts of the sequence. In our model,
P6 and P19 were linked by the disulfide bridge formed
by Cys33 and Cys97. Thus, residues that are distal along the
linear sequence (Asp34, Asp35, Asn96 and Glu101) were
brought into proximity by disulfide bonds and formed a
gp120-binding plane (Figure 4B). The structural element
required for SRCR1 to form high-affinity binding with
gp120 and to inhibit viral infection needs further investigation.
The identification of the contacting amino
acid residues in the current study will facilitate our understanding
of the structure and the design of molecular
mimics that present the required conformation as more
effective drugs.

Conclusions 

We performed docking analysis of the gp120-SRCR1 interaction
and made a number of predictions about the SRCR1
amino acid residues that may interact with gp120. Binding
analysis using a series of synthetic peptides derived
from SRCR1 demonstrated that peptides covering the three loop
regions with high docking frequency (aa 33-36, 73-76 and 94-97) bound
gp120. The relative binding indexes were consistent with
the predictions of the docking model, suggesting that
our model is appropriate for further analysis of the SRCR1-gp120
interaction and that, combined with biochemical
approaches, the model may facilitate the design of
new drugs targeting HIV-1.